Bar diagrams - histogram and barchart
The histogram command displays frequency distributions (value distributions) of numerical variables graphically: the values are grouped into intervals, and each interval gets a bar whose height shows how often values in that interval occur. Options control either the number of bars or the width of each interval. For variables with a limited number of distinct values, the discrete option is recommended. It produces a graph with one bar per value rather than one bar per interval.
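To make the relationship between the number of bars, the interval width, and the different bar scales concrete, here is a minimal sketch in plain Python. It is not how microdata.no implements histogram. The function names, the equal-width binning rule, and the sample incomes are assumptions made only for illustration.

```python
import math

def bin_edges(values, bins=None, width=None):
    """Return equal-width interval edges from either a bin count or a bin width.

    Assumes the values are not all identical (otherwise the width would be 0).
    """
    lo, hi = min(values), max(values)
    if width is None:
        width = (hi - lo) / bins                      # fixed number of bars
    else:
        bins = max(1, math.ceil((hi - lo) / width))   # fixed interval width
    return [lo + i * width for i in range(bins + 1)]

def histogram_counts(values, edges):
    """Count how many values fall into each interval; the maximum goes in the last bar."""
    width = edges[1] - edges[0]
    counts = [0] * (len(edges) - 1)
    for v in values:
        i = min(int((v - edges[0]) / width), len(counts) - 1)
        counts[i] += 1
    return counts

def scale(counts, width, mode="density"):
    """Convert raw counts to one of four common bar-height scales."""
    n = sum(counts)
    if mode == "freq":
        return list(counts)                       # number of observations
    if mode == "fraction":
        return [c / n for c in counts]            # heights sum to 1
    if mode == "percent":
        return [100 * c / n for c in counts]      # heights sum to 100
    return [c / (n * width) for c in counts]      # density: bar areas sum to 1

# Hypothetical incomes, for illustration only
incomes = [0, 100_000, 150_000, 250_000, 300_000, 400_000]
edges_by_count = bin_edges(incomes, bins=4)
edges_by_width = bin_edges(incomes, width=100_000)
counts = histogram_counts(incomes, edges_by_count)
density = scale(counts, 100_000, "density")
```

With these sample values, asking for 4 bars and asking for an interval width of 100000 give the same intervals, which is why only one of the two needs to be specified.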
The barchart command creates bar charts for numerical variables, i.e. a graphical display of the statistics that the summarize command reports. The bars can be grouped by the values of a categorical variable.
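As a rough illustration of what a grouped bar chart of means shows, the sketch below computes one mean per category in plain Python. It is not microdata.no code. The function name, the sample rows, and the choice to skip missing values are assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean

def grouped_mean(rows, value_key, group_key):
    """Return the mean of value_key for each distinct value of group_key.

    Rows where the value is missing (None) are skipped; this is an
    assumption of the sketch, not a documented platform rule.
    """
    groups = defaultdict(list)
    for row in rows:
        if row[value_key] is not None:
            groups[row[group_key]].append(row[value_key])
    return {group: mean(values) for group, values in groups.items()}

# Hypothetical records, for illustration only
rows = [
    {"gender": 1, "income": 300_000},
    {"gender": 1, "income": 500_000},
    {"gender": 2, "income": 250_000},
    {"gender": 2, "income": 450_000},
    {"gender": 2, "income": None},
]

# One bar height per category
bar_heights = grouped_mean(rows, "income", "gender")
```

Each key in the result corresponds to one bar, and its value to the height of that bar.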
// Histogram and barchart
require no.ssb.fdb:23 as db
create-dataset demography
import db/INNTEKT_WYRKINNT 2020-01-01 as income
import db/INNTEKT_BRUTTOFORM 2020-01-01 as wealth
import db/BEFOLKNING_KJOENN as gender
import db/BEFOLKNING_FOEDSELS_AAR_MND as birthdate
// Generate age as of 2020
generate age = 2020 - int(birthdate/100)
// Histogram (frequency distribution)
// Displays the frequency distribution of a metric/continuous variable: values are grouped into intervals, and each bar shows how often values in its interval occur.
// By default the bar areas add up to 1; options can override this.
// Options also let you set the number of bars or the interval width yourself, add a normal distribution curve as a reference, etc.
histogram income
histogram income, freq
histogram income, fraction
histogram income, percent
histogram income, normal
histogram income, bin(6) freq
histogram income, width(100000) freq
histogram income, by(gender)
histogram income if income > 100000
// With the discrete option, histograms can also be created for discrete variables. Each distinct value then gets its own bar
histogram age, discrete
// Barchart
// Bar charts present summary statistics for continuous/metric variables in a clear way. Several variables can be combined, and the figures can be broken down by a categorical variable (gender, level of education, etc.)
barchart (mean) income, over(gender)
barchart (mean) income wealth, over(gender)
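The age derivation in the script above relies on the birth date being stored as a year-month number (YYYYMM), so that dividing by 100 and truncating yields the birth year. A minimal Python sketch of the same arithmetic follows. The function name and the sample value are hypothetical.

```python
def age_in_year(birth_yyyymm, reference_year=2020):
    """Age reached during reference_year, given a birth date coded as YYYYMM."""
    birth_year = birth_yyyymm // 100   # drop the two month digits
    return reference_year - birth_year

# Hypothetical example: born July 1985
example_age = age_in_year(198507)
```

Because the birth month is discarded, the result is the age the person reaches at some point during 2020, not the exact age on a given date.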